PNAS Nexus
Oxford University Press (OUP)
Preprints posted in the last 7 days, ranked by how well they match PNAS Nexus's content profile, based on 147 papers previously published here. The average preprint has a 0.09% match score for this journal, so anything above that is already an above-average fit.
Corona-Moreno, R.; Acuna-Zegarra, M. A.; Santana-Cibrian, M.; Velasco-Hernandez, J. X.
During the COVID-19 pandemic, limited testing capacity and reporting delays complicated epidemic surveillance and decision-making in Mexico. We calibrated covidestim, a Bayesian nowcasting model, to estimate the total SARS-CoV-2 infections from reported cases and deaths using Mexican surveillance data. Disease-progression distribution priors were calibrated using Mexico City records and validated through comparisons with national seroprevalence surveys, hospitalization data, and annual reported severe-case rates across all states. Using the reconstructed estimates of active infections, we implemented an event-based risk framework that quantifies the probability of encountering at least one infectious individual in gatherings of different sizes. This probability was subsequently translated into a four-level epidemiological traffic-light indicator and computed at both state and municipality levels. The resulting estimates revealed substantial spatial heterogeneity that is obscured by state-level aggregation, particularly in states with marked differences between urban and rural municipalities. To evaluate consistency with public-health indicators, we compared the proposed risk classification with the official Mexican epidemiological traffic-light system, considering interpretable gathering sizes relevant to public-health decision making. Weekly reports derived from this framework were delivered to policymakers in the State of Queretaro in Mexico, as an anticipation tool for school reopening and public-space management. This demonstrates that this Bayesian reconstruction of infections combined with event-based risk metrics can provide an interpretable and generalizable municipality-level complement to routine surveillance systems, particularly in regions with limited testing capacity and heterogeneous local transmission dynamics.
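The event-based risk described in this abstract can be illustrated with a minimal sketch. It assumes attendees are drawn independently from a population with uniform prevalence, so the chance that at least one of n attendees is infectious is 1 - (1 - p)^n. The function names, the cutoff values, and the colour labels are hypothetical choices made for illustration and are not taken from the preprint.

```python
def gathering_risk(active_infections: float, population: float, group_size: int) -> float:
    """Probability that at least one attendee is infectious.

    Assumes independent attendees and uniform prevalence (an illustrative
    simplification, not necessarily the authors' exact formulation).
    """
    if population <= 0:
        raise ValueError("population must be positive")
    if group_size < 0:
        raise ValueError("group_size must be non-negative")
    # Clamp prevalence to [0, 1] so noisy estimates cannot break the formula.
    prevalence = min(max(active_infections / population, 0.0), 1.0)
    return 1.0 - (1.0 - prevalence) ** group_size


def traffic_light(risk: float, cutoffs=(0.25, 0.50, 0.75)) -> str:
    """Map a risk in [0, 1] to one of four levels.

    The cutoffs and labels are hypothetical placeholders.
    """
    levels = ("green", "yellow", "orange", "red")
    # Count how many cutoffs the risk meets or exceeds.
    index = sum(risk >= c for c in cutoffs)
    return levels[index]


if __name__ == "__main__":
    risk = gathering_risk(active_infections=100, population=10_000, group_size=50)
    print(round(risk, 3), traffic_light(risk))
```

Because the risk grows with group size at fixed prevalence, the same municipality can fall into different levels for different gathering sizes.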
Twohig, K. C.; Mansour, M.; Pugar, J. A.; Yuan, K.; Pocivavsek, L.; Klishin, A. A.
Biological systems evolve as continuous dynamical processes, but at organ-scale and across human lifespans they are rarely observed longitudinally; population data typically exist instead as sparse, cross-sectional snapshots. Inferring lifespan dynamics from such data requires methods distinct from those used at cellular and tissue scales where dense observations are accessible. We address this problem in the thoracic aorta, where surgical decisions currently rest on static, age- and sex-agnostic diameter thresholds that reduce three-dimensional morphology to a single scalar. Treating normal aortic morphology as a stochastic dynamical system, we pose a continuous-time drift-diffusion process in a two-coordinate state space of normalized surface area (A) and normalized fluctuation in integrated Gaussian curvature (δK), and fit closed-form solutions of the Fokker-Planck equation by maximum likelihood to a sex-balanced, age-uniform cohort spanning infancy to age 99. Inter-individual variability is treated as a fitted diffusion parameter rather than as residual scatter, which is distinct from prior normative studies that report variability as scatter around a regression line. The framework identifies two growth regimes for aortic size (childhood expansion followed by persistent adult growth, with adult males growing approximately 70% faster than adult females) and a single dynamical regime for aortic shape, with heteroscedastic variability accumulating at a rate comparable to the mean drift over the lifespan. Applied to independent cohorts of acute and chronic thoracic aortic dissections, the multivariate model identifies over 95% as statistical outliers via Mahalanobis distance, consistently outperforming either coordinate alone. The same probabilistic envelope that describes normal aging thus defines a baseline against which disease can be detected, supporting a shift toward dynamic, age- and sex-aware assessment of thoracic aortic pathology.
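The outlier test mentioned in this abstract can be sketched for a two-coordinate state. The sketch assumes the normal mean and covariance at a given age and sex are already available (in the preprint they would come from the fitted drift-diffusion model), and it assumes a 95% coverage cutoff, which the abstract does not state. All names and numbers below are hypothetical.

```python
import math


def mahalanobis_2d(point, mean, cov):
    """Mahalanobis distance of a 2D point from a 2D Gaussian.

    cov is a 2x2 covariance given as ((a, b), (c, d)).
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    # Quadratic form v^T cov^{-1} v, using the closed-form 2x2 inverse.
    quad = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.sqrt(quad)


def is_outlier(point, mean, cov, coverage=0.95):
    """Flag points outside the Gaussian ellipse containing `coverage` mass.

    For 2 degrees of freedom the chi-square quantile is -2 * ln(1 - coverage).
    The 0.95 default is an assumption made for illustration.
    """
    threshold_sq = -2.0 * math.log(1.0 - coverage)
    return mahalanobis_2d(point, mean, cov) ** 2 > threshold_sq


if __name__ == "__main__":
    mean = (1.0, 0.0)                   # hypothetical normal (A, deltaK)
    cov = ((0.04, 0.01), (0.01, 0.09))  # hypothetical covariance
    print(is_outlier((1.1, 0.1), mean, cov))  # close to the mean
    print(is_outlier((1.8, 0.9), mean, cov))  # far from the mean
```

Using both coordinates jointly lets the test account for their correlation, which a threshold on either coordinate alone cannot do.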
Zheng, Y.; Feng, B.; Cheng, R.; Qiu, C.; Long, Z.; Vaziri, K.; Hahn, J.
Accurate assessment of body composition is important for risk stratification and management of metabolic, musculoskeletal, and aging-related diseases, yet reference modalities such as dual-energy X-ray absorptiometry (DXA) are costly and impractical for frequent monitoring. Commodity 3D body scans offer a low-cost, radiation-free alternative, but extracting meaningful and predictive shape features from scans remains challenging due to nonuniform point density, variable body size and cross-device differences. We introduce BodyMAE, a self-supervised, surface-area aware masked autoencoder for metric-scale 3D body scans. The pipeline integrates area-adjusted sampling, a long-range focused encoder, and a lightweight decoder regularized to promote locally uniform reconstructions. Trained and evaluated on 917 3D body scans paired with clinical DXA reports, BodyMAE achieves strong accuracy on fat percentage (root-mean-square error (RMSE) 3.825 percentage points, R^2 0.908), fat mass (RMSE 3.694 kg, R^2 0.968), and lean mass (RMSE 3.608 kg, R^2 0.901), with competitive performance on bone mineral content (RMSE 0.284 kg, R^2 0.754). We also assess feature stability across pretrained baselines, finding higher retrieval accuracy for our representations (Top-1 90.131%). These results indicate that combining metric-aware sampling, long-range relational encoding, and local geometric regularization enables accurate body composition estimation from 3D body scans, as validated by comparisons to DXA-derived measurements.
Biswas, M. A.; Laila, A.
Background: Machine learning models trained on population health surveys offer scalable tools for cardiovascular screening, but recurring methodological weaknesses undermine their credibility and equity: data leakage from synthetic oversampling, qualitative rather than quantitative explainability evaluation, and the absence of demographic fairness auditing at the clinical operating threshold. Methods: We present EXHEART, a leakage-free stacked ensemble pipeline trained on BRFSS 2015 (n = 253,680) and validated on BRFSS 2020 (n = 319,795; temporal transport and retrain) and a clinical cardiovascular examination dataset (n = 68,730). The pipeline combines XGBoost, LightGBM, Random Forest, and a multi-layer perceptron as base learners with 5-fold out-of-fold logistic regression stacking and Platt scaling calibration. A quantitative SHAP-LIME consistency framework, based on Kendall-tau rank correlation and Jaccard overlap, accompanies a decision-curve analysis, a subgroup-stratified SHAP interaction analysis, and an intersectional fairness audit (Sex x Age x Income) with threshold-shifting mitigation and a frontier of the fairness-utility trade-off. The framework also adds cross-instrument fairness-disparity attribution, an empirical diagnostic that provides evidence on whether an observed subgroup disparity is more consistent with a measurement-induced or a substantive explanation by re-validating it on a dataset that measures the same clinical construct objectively. On heart disease, this diagnostic associates 89% of the sex TPR gap (95% CI [0.65, 0.99]) with the self-reported survey outcome rather than with a substantive risk difference. Results: On BRFSS 2015, EXHEART achieves AUC-ROC = 0.850, AUPRC = 0.371, Brier score = 0.071, and reduces ECE by 96% (0.256 to 0.011) via Platt scaling. 
Global SHAP-LIME rank agreement is moderate-to-strong (Kendall-tau = 0.580, Spearman-rho = 0.818) with a substantial top-3 divergence (Jaccard@3 = 0.200), where Stroke flips from SHAP rank 8 to LIME rank 1. The Sex TPR gap is 0.124 at the screening threshold; intersectional Sex x Age disparities reach 0.649 among adequately-powered cells, 5.2x the single-attribute gap. Temporal transport to BRFSS 2020 collapses sensitivity from 0.776 to 0.267, while retraining restores AUC = 0.840 and ECE = 0.012. On clinical examination data, the Sex TPR gap collapses to 0.014; the attribution test indicates this gap is instrument-dependent, consistent with a measurement or outcome-definition explanation rather than a substantive risk difference. Cross-domain SHAP analysis identifies four instrument-independent CVD risk factors and two major portability failures. Conclusions: EXHEART combines three practices that population-scale cardiovascular classifiers usually apply in isolation: leakage-free training with calibrated probabilities, a test of whether the model's explanations are stable, and a fairness audit that examines intersecting subgroups rather than single attributes. Bringing them together proved worthwhile. The intersectional audit revealed disparities that single-attribute auditing missed, and the cross-instrument comparison indicated that much of the sex gap reflects how the outcome is measured in survey data rather than a substantive difference in risk. The temporal transport findings indicate that deployed BRFSS models warrant periodic monitoring and retraining to maintain clinical utility. EXHEART is a retrospective methodological evaluation on public de-identified data; it is not validated for direct clinical decision-making, diagnosis, or treatment recommendation without prospective clinical validation.
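The two agreement measures named in this abstract, Kendall-tau rank correlation and Jaccard overlap of the top-k features, can be sketched in plain Python. The sketch assumes both explainers rank the same set of features without ties; the feature names are hypothetical and the functions are illustrative, not the EXHEART implementation.

```python
from itertools import combinations


def jaccard_at_k(rank_a, rank_b, k):
    """Jaccard overlap between the top-k items of two rankings."""
    top_a, top_b = set(rank_a[:k]), set(rank_b[:k])
    return len(top_a & top_b) / len(top_a | top_b)


def kendall_tau(rank_a, rank_b):
    """Kendall tau-a between two tie-free rankings of the same items."""
    if len(rank_a) < 2:
        raise ValueError("need at least two items")
    if len(set(rank_a)) != len(rank_a) or set(rank_a) != set(rank_b):
        raise ValueError("rankings must contain the same unique items")
    position_b = {item: i for i, item in enumerate(rank_b)}
    concordant = discordant = 0
    # rank_a places item i before item j; check whether rank_b agrees.
    for i, j in combinations(range(len(rank_a)), 2):
        if position_b[rank_a[i]] < position_b[rank_a[j]]:
            concordant += 1
        else:
            discordant += 1
    n_pairs = len(rank_a) * (len(rank_a) - 1) // 2
    return (concordant - discordant) / n_pairs


if __name__ == "__main__":
    shap_rank = ["age", "bp", "chol", "stroke"]  # hypothetical feature names
    lime_rank = ["stroke", "age", "bp", "chol"]
    print(jaccard_at_k(shap_rank, lime_rank, 3), kendall_tau(shap_rank, lime_rank))
```

With k = 3, a Jaccard value of 0.200 corresponds to one shared feature out of five distinct features across the two top-3 lists, so the overall rank agreement can be moderate while the top of the two lists barely overlaps.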
Gong, L.; Aswani, N.; Shahinian, P.; Yang, J. Y.; Kontos, D.; Manji, G.; Kang, S.; Hur, C.
Electronic health record (EHR) prediction models often summarize longitudinal histories as static patient-level features, which may omit potentially informative event ordering. We developed a simplified spike-timing-dependent plasticity (STDP)-inspired framework that represents asynchronous EHR data as sparse, directional transition features. The approach encodes whether one clinical event precedes another within prespecified temporal windows, preserving event identity, directionality, and approximate timing while retaining feature-level interpretability. We evaluated this framework in two retrospective prediction tasks with different temporal scales: incident acute kidney injury (AKI) prediction in 17,351 MIMIC-IV ICU stays and early postoperative recurrence prediction in 713 CUMC patients with pancreatic ductal adenocarcinoma (PDAC). Models using static burden features (demographics, comorbidities, raw lab measurements) were compared with models that additionally included STDP transition feature sets, using patient-level cross-validation and rolling prediction horizons. In AKI, a calibrated STDP ensemble model showed higher discrimination than static burden alone at the 24-hour decision snapshot for AKI by 72 hours, with AUROC 0.838 versus 0.800, and at 48 hours for near-term AKI prediction, with AUROC 0.868 versus 0.827. In PDAC, STDP transition features modestly improved Day -30 preoperative recurrence prediction, with AUROC 0.611 versus 0.587 and AUPRC 0.323 versus 0.318 for static burden, and showed similar performance at Day 0 (7 days before recorded surgery date), with AUROC 0.681 and AUPRC 0.363. Decision-curve and feature analyses suggested that selected temporal transitions were clinically interpretable across renal, inflammatory, hepatobiliary, hematologic, glycemic, and nutritional trajectories.
These findings suggest that STDP-inspired transition features may provide a practical, interpretable way to incorporate temporal ordering into EHR-based risk prediction across both acute and longitudinal settings.
Jones, L.; Ergas, R.; Tibbs, A.; Russo, E. T.; Norville, J.; Bingay, B.; Brown, C. M.; Reich, N. G.; Pasco, R.
Background Pediatric immunizations for Respiratory Syncytial Virus (RSV), including monoclonal antibodies for infants and vaccines for pregnant people, have become broadly available and can prevent severe RSV outcomes in infants. However, quantifying the impact of RSV immunization in prevention of severe pediatric illness at the population-level is limited by lack of RSV case surveillance data. The Massachusetts Department of Public Health (DPH) conducted a modeling analysis using routine public health surveillance data to estimate the state-level impact of new RSV immunization products on Emergency Department (ED) visits and hospitalizations in Massachusetts for highest risk pediatric groups. Methods A scenario projection tool, called R.Scenario.Vax, was utilized to simulate RSV-associated ED hospital encounters by age group in the context of newly available immunizations. ED visit and hospitalization data from the National Syndromic Surveillance Program (NSSP) during the time period 10/08/2017--10/19/2024 were analyzed, scaled to account for changes in RSV testing practices over time and missing encounter volume in historic data, and utilized to inform model fit of a "typical" RSV season. RSV immunization data from the Massachusetts Immunization Information System (MIIS) for the 2023--2024 and 2024--2025 RSV seasons informed high and moderate pediatric RSV immunization coverage scenarios and their impact was compared to a counterfactual reference scenario of no new immunizations. Median projections were quantitatively and qualitatively compared to observed 2024--2025 season data. Percent reduction in hospital encounters and encounters averted per 10,000 population were calculated for each scenario as compared to the reference. Results Projections for the youngest at-risk age groups showed significantly lower RSV-associated ED visits and hospitalizations during the 2024--2025 season for both high and moderate immunization coverage scenarios. 
Median projections for infants under 6 months old in the highest coverage scenario, wherein nearly all infants were immunized, showed 72.6% lower ED visits and 73.4% lower hospitalizations when compared to the reference scenario, equating to 262 ED visits and 85 hospitalizations averted per 10,000 population. Conclusions Our results support the use of modeling methods for public health insights and suggest that RSV immunizations for infant populations result in significantly lower RSV-related ED encounters in Massachusetts.
Li, K.; Perniciaro, S.; Kwon, J.; Grubaugh, N. D.; Weinberger, D. M.; Pitzer, V. E.
Human metapneumovirus (HMPV) causes acute lower respiratory infections, primarily affecting young children and older adults, with seasonal outbreaks peaking annually in March or April in the United States and other temperate regions in the Northern hemisphere. However, the factors driving HMPV seasonality in the United States remain poorly understood. We analyzed laboratory-confirmed HMPV cases and age-specific emergency department visits across 10 US regions, fitting an age-stratified dynamic transmission model to assess spatiotemporal patterns and investigate the influence of environmental variables and viral interference from RSV on HMPV transmission rates. We found that models incorporating climate variables into the transmission rate, including vapor pressure, precipitation, potential evapotranspiration, and minimum temperature, could not capture the timing of HMPV activity across all regions. Instead, HMPV timing was associated with RSV activity, with the HMPV transmission rate reduced in the presence of RSV. We showed that, unlike RSV, only models incorporating viral interference could reproduce the biennial pattern of HMPV observed in some regions, characterized by alternating late-small and early-large epidemics. Furthermore, our model successfully reproduced post-COVID-19 HMPV and RSV epidemics and predicted that RSV interventions are not likely to lead to a substantial increase in HMPV activity despite decreasing competition from RSV. Our work unravels the spatiotemporal dynamics of HMPV and its interaction with RSV, informing future seasonal forecasting and intervention strategies for HMPV.
Middleton, C.; Larremore, D.
An ongoing outbreak of Bundibugyo virus disease (BVD) in the Democratic Republic of the Congo was deemed a public health emergency of international concern in May 2026. To prevent cross-border importation, many countries, including the United States, Canada, India, Thailand, and Kenya, have already proposed containment strategies, and others are likely to follow suit. How well (or poorly) are screening and quarantine containment measures likely to work? We leverage established epidemiological theory and develop a mathematical model of traveler screening and post-arrival quarantine for BVD to answer this question. We find that traveler screening via symptom screening or molecular testing will miss the majority of infected travelers, and should be complemented by post-arrival quarantine and monitoring of sufficient duration to detect those with long incubation periods. Our findings underscore the limitations of border screening and the importance of complementary measures like post-arrival quarantine to prevent local importation of BVD.
Hines, A. G.; Mathis, S. M.; Johansson, M. A.; Biggerstaff, M.; Reed, C.; Borchering, R.
Since the U.S. 2013/14 influenza season, the CDC's FluSight Challenge has provided a platform for evaluating influenza forecasting models and fostering collaboration across institutions. The Challenge aims to improve the science and enhance the utility of infectious disease forecasts for public health decision making. We analyzed ten years of submitted forecasts (2014/15-2019/20 (influenza-like illness seasons) and 2021/22-2024/25 (hospital admissions seasons)) across a range of model types, including statistical, mechanistic, machine learning, and hybrid models. Influenza-like illness (ILI) forecasts were evaluated using the exponentiated logarithmic score (skill metric) while hospital admissions forecasts were evaluated using the log transformed relative Weighted Interval Score. Corresponding potential performance differences were assessed using Wilcoxon rank-sum tests, and associations with team participation history were evaluated using Spearman's rank correlation. Model performance varied by season, and no single model type consistently outperformed others. In ILI seasons, statistical models generally performed better than mechanistic and machine learning models, though consistent differences were not observed in more recent hospital admissions seasons. Ensemble forecasts showed better overall performance across seasons, and the CDC's FluSight ensemble ranked among the top-performing forecasts every year. We also found a positive correlation between forecast accuracy and the number of years a team participated in the Challenge, with statistically significant associations in four seasons. These findings highlight the benefits of ensemble approaches and sustained engagement in improving forecasting performance, while also underscoring the continued value of forecast evaluation before and following the COVID-19 pandemic. 
Insights from the FluSight Challenge can guide future infectious disease forecasting efforts and support more effective public health preparedness.
Colitta, A.; Bruno, S.; Benedetti, D.; Hoxhaj, D.; Cruz-Sanabria, F.; Di Pede, C.; Buracchi Torresi, F.; Frumento, P.; Gargani, L.; Fabbrini, M.; Maestri Tassoni, M.; Bonanni, E.; Faraguna, U.
AIMS Cardiometabolic risk factors may impair health by altering the autonomic modulation of the cardiovascular system, a physiological process described by heart rate (HR) circadian oscillations. However, the impact of cardiometabolic health determinants on HR circadian oscillations remains scarcely characterized in real-world, population-based settings. To address this, we applied digital health technologies to investigate how cardiometabolic health determinants shape HR circadian oscillations in a real-world cohort of individuals free of cardiometabolic diseases. METHODS First, a 10-fold cross-validation of a model was performed, aimed at mitigating wearable measurement error caused by motion artifacts. This process was informed by 10,056 epochs of concurrent wearable-derived and polysomnographic HR assessment, yielding an average 1.3 bpm reduction in wearable measurement error. We subsequently applied this model to over 2 million 1-minute epochs of HR data, derived from 7-day continuous actigraphic recordings of 245 individuals free of cardiometabolic disorders. Functional-on-scalar regression modelling and both parametric and nonparametric analyses characterized HR circadian profiles and their relationships with demographics, lifestyle, chronotype, sleep health, and chronic insomnia diagnosis. A 6-dimension sleep health index was calculated. RESULTS Sex, chronotype, and sleep health predominantly shaped HR circadian oscillations. In detail, females consistently showed higher HR across the 24 hours. Moreover, chronotype was associated with a phase shift in HR circadian profiles, with later timings corresponding to eveningness. Notably, sleep health impacted HR circadian oscillations in a dose-dependent fashion: each additional impaired sleep dimension was associated with a 1.2 bpm HR increase during nighttime, alongside reduced circadian robustness and delayed oscillation timings.
Finally, the earlier occurrence of morning HR peaks served as a digital biomarker of insomnia (80% specificity, 74% sensitivity). CONCLUSIONS This work provides a digital health framework to characterize HR circadian oscillations in free-living populations and supports its clinical utility in capturing the autonomic disruptions related to cardiometabolic health determinants.
Vomo-Donfack, K. L.; Bousquet, G.; Falgarone, G.; Ginot, G.; Morilla, I.
Whole-genome sequencing comprehensively captures coding, non-coding and structural variation in families with suspected inherited disorders, yet its clinical utility remains constrained by an interpretation bottleneck: selecting a handful of relevant variants from millions of candidates. Current rule-based pipelines, anchored in ACMG/AMP criteria, excel at identifying highly penetrant Mendelian alleles but frequently miss variants of low-to-moderate penetrance, non-coding alterations and germline-somatic interactions. Here we introduce PolyCLIP-T, a topology-guided multimodal framework that transforms variant selection from a classification problem into a geometric discovery task. By contrastively aligning DNA-sequence embeddings with functional annotations, PolyCLIP-T constructs a unified latent space in which the displacement between reference and alternate embeddings quantifies the molecular perturbation induced by each variant. Persistent homology then identifies stable topological components - coherent variant groups shared among affected relatives - that transcend single-variant scoring logic. Applied to six families with multi-morbid cancer, autoimmune and cardiovascular disease, PolyCLIP-T recovered non-coding and structural candidates overlooked by conventional pipelines and revealed pleiotropic networks spanning disease categories. This approach provides an interpretable, scalable solution for genome-first investigations of disorders driven by polygenic architectures that evade single-variant analysis. The framework was developed and benchmarked on deeply characterised familial cohorts selected for transgenerational multimorbidity; validation in larger, independent populations will be essential to establish its generalisability. An interactive web tool is freely available at https://www.polyclip-t.uma.es/.
Felici, B.; Ritchie, S. C.; Khullar, S.; Foguet, C.; Persyn, E.; Manikpurage, H. D.; Liu, Y.; Lambert, S. A.; Ip, S.; Rudd, J. H. F.; Inouye, M.
Cardiovascular diseases (CVDs) are highly heritable, but pathogenesis at the organ and physiological level is still poorly defined. Polygenic risk scores (PRSs), which estimate individual genetic susceptibility to a disease, may allow for the identification of associated abnormal organ structures. Ultimately, identifying where cardiovascular polygenic risk manifests can guide early interventions, shape mechanistic hypotheses, and motivate prevention trials for cardiac remodelling. This study investigated the association between PRSs for five common CVDs [heart failure (HF), coronary artery disease (CAD), atrial fibrillation (AF), abdominal aortic aneurysm (AAA) and ischaemic stroke (IS)] and 28 imaging-derived phenotypes (IDPs) from cardiac magnetic resonance imaging of ~62,000 participants in UK Biobank. To investigate the cardiac features associated with elevated polygenic risk of CVDs, we tested CVD PRSs against cardiac IDPs and identified 97 significant associations (FDR ≤ 0.05). We further identified 32 significant putative mediators between CVD PRSs and incident disease events, revealing that across CVDs, polygenic risk manifested as distinct patterns in cardiac structures. HF implicated all cardiac chambers, including left ventricular and left atrial dysfunction alongside enlarged aorta. AF was characterised by biatrial enlargement and reduced ejection fractions, most prominently in the left atrium but also involving left ventricular wall thickness. IS exhibited left ventricular hypertrophy and left atrial dysfunction, while CAD predominantly involved left ventricular hypertrophy. AAA was primarily characterised by enlarged descending aorta. Overall, cardiac IDPs mediated a substantial proportion of polygenic risk for CVDs, in particular for HF. Taken together, our results show that cardiac structure and function lie on the pathway between polygenic risk and cardiovascular events.
KESOZI Digital Twin; Agumba, J. O.; Namusonge, L.; Ogendo, J.; Hassan, M. A.; Pembere, A.; Takavarasha, M.
Childhood diarrheal disease remains a leading cause of morbidity and mortality among children under five years in sub-Saharan Africa, particularly in settings affected by inadequate sanitation, climate variability, malnutrition, and limited healthcare access. Conventional forecasting approaches are often constrained by sparse surveillance data, weak spatial representation, and limited incorporation of mechanistic disease dynamics. This study presents a Physics-Informed Multimodal Artificial Intelligence Digital Twin framework that integrates Physics-Informed Neural Networks, Graph Neural Networks, diffusion-reaction epidemiological modeling, multimodal fusion learning, and Digital Twin simulation to estimate and predict childhood diarrheal disease burden in Kenya, Somaliland, and Zimbabwe. Using public epidemiological, environmental, climate, sanitation, and synthetic proof-of-concept datasets, the framework modeled temporal disease dynamics, spatial transmission, pathogen-attributed burden, and outbreak trajectories while enforcing epidemiological consistency through physics-informed optimization. Results demonstrated robust forecasting performance, enhanced spatial transmission modeling, uncertainty-aware predictions, and realistic outbreak simulations across the three countries. Rotavirus, Shigella, and Cryptosporidium were identified as major contributors to modeled mortality burden, while unsafe water exposure, poor sanitation, malnutrition, and climate-sensitive transmission substantially increased disease risk. Compared with a Bayesian baseline model, the multimodal framework achieved superior nonlinear risk characterization, geospatial learning, and temporal prediction. These findings highlight the potential of scientific machine learning and digital twin systems for infectious disease surveillance, outbreak forecasting, climate-health analytics, and evidence-based public health decision-making in low-resource African settings. 
Keywords: Physics-Informed Neural Networks, Graph Neural Networks, Digital Twin, Childhood Diarrheal Disease, Epidemiology, Kenya, Somaliland, Zimbabwe, Scientific Machine Learning, Spatial Epidemiology, Multimodal Fusion
Aydogdu, D.; Gaber, F.; Sorooshmehr, A.; Akalin, A.
Cardiovascular diseases (CVDs) remain the primary global health burden, motivating the search for robust, non-invasive risk biomarkers. We harness a foundation model pretrained on over 10 million recordings to evaluate ECG-derived age deviation as a cross-cohort biomarker of CVD burden. A predictive model, trained exclusively on healthy subjects, achieved accurate age prediction. Diseased subjects exhibited significant positive age acceleration across multiple categories, with structural and ischemic heart diseases showing the largest effects. External validation in a hospital-based cohort (n=160,493) confirmed that age acceleration independently predicts all-cause mortality, with the strongest prognostic value in patients under 65 years. Furthermore, we demonstrated that disease discrimination and mortality prediction are preserved across 6-lead and single-lead configurations, supporting potential deployment in wearable or mobile devices. Our analysis also revealed a striking morphological confound from complete left bundle branch block, leading us to propose absolute age deviation as a more robust, universal risk marker. These findings establish ECG-derived biological age deviation as a highly generalizable and clinically actionable biomarker for assessing cardiovascular risk. We have also developed a web application at https://bioinformatics.mdc-berlin.de/ECGage that allows users to easily test our framework.
Kim, D.; Pasco, R.; Johnson, K. E.; Fox, S. J.; Reich, N. G.; Meyers, L. A.
Accurate outbreak forecasts are critical for timely and effective public health response. In the United States, however, most forecasts are produced at the state level, which can mask substantial sub-state heterogeneity and limit their utility for local planning. We generated and evaluated forecasts of the percentage of Emergency Department visits attributable to influenza across 173 large metropolitan Health Service Areas (HSAs) using a gradient boosting quantile regression (GBQR) model, and compared their accuracy to forecasts derived from state-level data alone. At one-week, two-week and three-week horizons, local forecasts outperformed state-based forecasts in 98.8%, 90.8%, and 78.6% of HSAs, respectively, achieving mean weighted interval scores that were on average 39.2% lower (95% range: 5.9% to 76.7%), 19.6% lower (-6.3% to 59.5%), and 11.4% lower (-11.7% to 44.9%), respectively. The performance advantage of local forecasting was strongest in HSAs representing a smaller share of their state's population and increased with the proportion of the HSA population living in urban areas and the number of metropolitan areas within a state. These results, based on an analysis of HSAs with populations greater than 250,000, demonstrate that fine-scale modeling can substantially improve forecast accuracy and highlight the potential value of local forecasts for outbreak preparedness and response.
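The weighted interval score used to compare forecasts in this abstract can be sketched as follows. This is the standard definition for a quantile forecast made of a median and K central prediction intervals, not code from the preprint; the function names and the example numbers are hypothetical.

```python
def interval_score(lower, upper, alpha, observed):
    """Interval score for a central (1 - alpha) prediction interval.

    Width of the interval plus a penalty, scaled by 2 / alpha, for
    observations that fall outside it.
    """
    score = upper - lower
    if observed < lower:
        score += (2.0 / alpha) * (lower - observed)
    elif observed > upper:
        score += (2.0 / alpha) * (observed - upper)
    return score


def weighted_interval_score(median, intervals, observed):
    """Weighted interval score (lower is better).

    intervals maps alpha -> (lower, upper) for each central interval.
    The median term has weight 1/2 and each interval has weight alpha / 2.
    """
    total = 0.5 * abs(observed - median)
    for alpha, (lower, upper) in intervals.items():
        total += (alpha / 2.0) * interval_score(lower, upper, alpha, observed)
    return total / (len(intervals) + 0.5)


if __name__ == "__main__":
    # Hypothetical forecast of percent ED visits due to influenza.
    forecast = {0.5: (2.0, 3.0), 0.1: (1.2, 4.1)}
    print(weighted_interval_score(median=2.5, intervals=forecast, observed=3.4))
```

A forecast that is both sharp (narrow intervals) and well centred on the observation scores lowest, which is why a relative reduction in mean score can be read as an accuracy gain.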
Pujolassos, M.; Kurilshikov, A.; Weersma, R. K.; Yang-Fu, J.; Zhernakova, A.; Calle, M. L.
While the microbiome is increasingly recognized as crucial for human health, translating this knowledge into effective healthcare and preventive strategies remains challenging. Many studies focus on identifying changes in microbiome composition associated with disease and evaluating the potential of such disease-associated microbial profiles as biomarkers for disease diagnosis. Under the hypothesis that microbiome dysbiosis may reflect physiological alterations present long before disease onset, in this work, we analyse the potential of disease-specific microbial signatures not as a diagnostic tool when the disease is already present, but as a means of health assessment in the general population. Moreover, instead of trying to define a single health measure, we believe it is necessary to consider several ways in which the microbiome departs from health, according to different disease-related physiological changes. To evaluate our assumptions, we designed a two-stage study: the identification of disease-specific microbial signatures (discovery stage) and, subsequently, the study of their distribution in the general population to assess associations with general health (external validation stage). Specifically, in the discovery phase we characterized 16 disease-specific bacterial signatures from large public microbiome data using a compositional data analysis methodology. In the second phase, we quantified these microbial signatures in the Lifelines-DMP cohort, a large population-based cohort, and evaluated their association with self-reported health status. Results indicate that most disease-specific microbial signatures associate with health status, supporting our assumption that microbial composition can capture physiological alterations before disease onset, and highlighting the importance of considering multiple ways in which the microbiome departs from a healthy state.
These findings reaffirm the potential of microbial information as an additional tool in preventive medicine.
Panchumarthi, L. Y.; Kataria, S.; Wu, Y.; Hu, X.; Fedorov, A.; Kwak, H. G.
Background. Fairness-aware machine learning increasingly targets demographic performance disparities in clinical prediction, yet whether standard bias mitigation strategies genuinely improve equity in physiological signal analysis remains unclear. Age-based disparities in photoplethysmography (PPG)-based heart rate prediction present a particular challenge, as age-related performance differences may reflect context-dependent physiological structure rather than correctable artifacts. Methods. We evaluated three fairness interventions: inverse-frequency weighting (IF), Group Distributionally Robust Optimization (GroupDRO), and adversarial debiasing (ADV). Each was applied via fine-tuning of a PPG foundation model across three clinical datasets spanning intensive care unit, laboratory, and consumer wearable contexts. Outcomes were assessed using a 2x2 framework classifying each intervention-dataset combination by the joint direction of change in mean absolute error (MAE) and fairness gap (FG) across age groups, yielding four outcome types: genuine improvement (G), leveling down (L), selective benefit (S), and both worse (W). Results. Across nine intra-domain conditions, no intervention simultaneously improved both MAE and FG (0/9 genuine improvement). The dominant pattern was leveling down (5/9): FG decreased but was accompanied by MAE degradation, indicating that apparent fairness gains were achieved at the cost of overall predictive performance. Age-group difficulty ordering varied across clinical contexts at baseline and was not preserved under intervention. In 18 cross-domain transfer conditions, genuine improvement was rare (4/18) and observed exclusively in non-MIMIC source configurations; models fine-tuned on MIMIC-sourced data yielded no genuine improvements (0/6). Embedding-level representation changes following fine-tuning did not reliably predict fairness outcomes. Conclusions. Age-based fairness interventions in PPG heart rate prediction predominantly produced leveling down rather than genuine equity improvement, suggesting that age-related performance gaps reflect context-dependent physiological structure not fully addressable through standard bias mitigation. Cross-domain transfer further amplifies this instability. These findings suggest that fairness evaluation frameworks for age-stratified physiological prediction should account for context-dependent performance structure rather than treating observed gaps as correctable bias.
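The 2x2 outcome framework described in this abstract can be illustrated with a minimal sketch. The abstract defines genuine improvement (both MAE and FG improve) and leveling down (FG decreases while MAE degrades); the reading of selective benefit as "MAE improves while FG widens" and the treatment of zero change as "not improved" are assumptions made here for illustration, as are the function name and the example deltas, none of which come from the study.

```python
def classify_outcome(delta_mae: float, delta_fg: float) -> str:
    """Classify one intervention-dataset condition in a 2x2 scheme.

    delta_mae: MAE after intervention minus MAE at baseline.
    delta_fg:  fairness gap after intervention minus gap at baseline.
    A negative delta counts as an improvement; zero change is treated
    as "not improved" (an assumption, not stated in the abstract).
    """
    mae_improved = delta_mae < 0
    fg_improved = delta_fg < 0
    if mae_improved and fg_improved:
        return "G"  # genuine improvement: both metrics get better
    if fg_improved:
        return "L"  # leveling down: gap shrinks, overall error grows
    if mae_improved:
        return "S"  # selective benefit (assumed meaning): error drops, gap widens
    return "W"      # both worse


# Hypothetical deltas, for illustration only.
examples = {
    "both improve": (-0.4, -0.2),
    "gap shrinks, error grows": (0.6, -0.3),
    "error drops, gap widens": (-0.5, 0.1),
    "both degrade": (0.2, 0.4),
}
labels = {name: classify_outcome(*deltas) for name, deltas in examples.items()}
```

Counting the labels over all conditions would give tallies of the kind reported in the abstract (for example, the share of conditions labeled "G").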
Chung, R.; Chalasani, N. S.; Barbehenn, A. S.; Lundgren, E.; Savur, S.; Shome, S.; Sheikhzadeh, C. H.; Sarvadhavabhatla, S.; Donaire, M. S.; Pae, V.; Chu, X.; Winder, D.; Maguire, C. T.; Topal, S.; Ganesan, A.; Yabes, J. M.; Larson, D. T.; Lalani, T.; Ewers, E. C.; Colombo, R. E.; Dugan, E.; Rathore, U.; Marson, A.; Agan, B. K.; Tomalka, J. A.; Sekaly, R.-P.; Ioannidis, N. M.; Lee, S. A.
People with HIV (PWH) exhibit elevated inflammation and cardiovascular risk despite antiretroviral therapy (ART). To define the genetic architecture of inflammasome-associated inflammation, we performed whole-genome sequencing and quantified plasma IL-6, IL-1β, and IL-18 in 1,000 ART-suppressed PWH from the U.S. Military HIV Natural History Study. Genome-wide analyses identified 14 loci implicating antiviral defense (DDX17, DDX41, EEA1, BCL11A), lipid metabolism (ABCA1, ABCA12, ABCC1, AGMO), and vascular remodeling (KLHL29, RNF213, ETV1). Transcriptome-wide analyses across cardiovascular and immune tissues identified regulatory programs linking interferon signaling, immune activation, and vascular biology to circulating cytokine levels. Mendelian randomization analyses supported causal relationships between inflammasome-associated cytokines and vascular events. Functional integration with genome-wide CRISPR perturbation datasets in primary CD4 T cells linked cytokine-associated loci to HIV antiviral pathways and cytokine regulatory networks. External validation in cohorts without HIV demonstrated pathway-level convergence despite limited variant-level overlap. These findings define genetic mechanisms linking inflammasome signaling, antiviral defense, and cardiovascular risk.
Ernandez, J.; Najafi, A.; Roehrborn, C. G.; Lerner, L. B.
PURPOSE: As the armamentarium of BPH therapies continues to expand, it remains imperative to maximize patient satisfaction and minimize decisional regret. We sought to determine the impact of time from BPH diagnosis to index treatment on symptom improvement and subsequent procedural events. MATERIALS AND METHODS: We queried the American Urological Association Quality Registry for men ≥40 years old with BPH, available IPSS data, and no receipt of prior BPH treatment. Index treatment included medication, surgery, or minimally invasive surgical therapy (MIST). Outcomes included IPSS over 3 years of follow-up, change in percentage of mild lower urinary tract symptoms (LUTS) by 3 months, and time to procedural event. Patients were stratified by time from index diagnosis to treatment into <12 months, 1-3 years, and >3 years. Outcomes were compared across time-to-treatment cohorts using appropriate statistical tests, with p < 0.05 considered significant. RESULTS: A total of 43,919 patients met criteria, of whom 19,642 pursued treatment. Patients pursued treatment at lower baseline IPSS than in prior prospective series. Patients undergoing surgery and MIST had significantly higher baseline IPSS, while medical comorbidities were significantly more common among men initiating pharmacotherapy. Early surgery and MIST were associated with significant improvement in IPSS within 6-12 months and an increase in mild LUTS by 3 months. All forms of early treatment were associated with delayed time to procedural events, including catheterization and fulguration. CONCLUSIONS: Early procedural intervention for BPH is associated with early symptom improvement and delayed time to procedural events in real-world, contemporary practice.
Colosi, E.; Calmon, L.; Fässli, M.; Koch, K.; Bielicki, J. A.; Colizza, V.
Pooled testing programs were introduced during the COVID-19 pandemic to expand surveillance capacity while preserving testing resources, but evidence on their epidemiological impact in schools under real-world conditions remains limited. We analyzed data from the pooled testing program implemented in public primary schools of the canton of Basel-Landschaft, Switzerland, during the Fall-Winter 2021 Delta wave. We used an agent-based transmission model informed by pooled and individual testing results, school characteristics, contact networks, and community incidence. The model was fitted to pooled positivity ratios in four clusters of administrative areas with similar epidemic trajectories. We compared pooled testing with alternative protocols in terms of school transmission, testing volume, and student-days lost. During the study period, pooled testing was offered to 21,187 students across 62 public primary schools, with high and stable participation across clusters (mean 71-79%). The fitted model reproduced observed pool positivity trends well. Compared with pooled testing, reactive class closure, reactive screening, and symptomatic testing were associated with higher in-school transmission, with excess transmission ranging from 50% to 87%, 63% to 104%, and 72% to 133% across clusters, respectively. Weekly individual screening achieved similar reductions in transmission but required 15-25 times more tests. Relaxing class closure after depooling substantially reduced student-days lost without increasing transmission. Under real-world conditions, pooled testing provided an effective and resource-efficient strategy to reduce SARS-CoV-2 transmission in primary schools. Combining early detection of asymptomatic infections with low testing demands, pooled testing offers a scalable approach to school surveillance and control for pandemic response in educational settings.
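The testing-volume comparison in this abstract rests on simple arithmetic, sketched below under stated assumptions: each pool is tested once, and every member of a positive pool is then tested individually (depooling). The function names, pool sizes, and number of positive pools are hypothetical and chosen only for illustration; they are not the study's data, and the sketch ignores participation rates and the agent-based model entirely.

```python
from typing import Sequence


def pooled_test_count(pool_sizes: Sequence[int], positive: Sequence[bool]) -> int:
    """Tests used in one pooled round: one per pool, plus one per member
    of each positive pool for depooling (an assumed protocol)."""
    if len(pool_sizes) != len(positive):
        raise ValueError("pool_sizes and positive must have the same length")
    pool_tests = len(pool_sizes)
    depool_tests = sum(size for size, is_pos in zip(pool_sizes, positive) if is_pos)
    return pool_tests + depool_tests


def individual_test_count(pool_sizes: Sequence[int]) -> int:
    """Tests used if every participant is screened individually."""
    return sum(pool_sizes)


# Hypothetical round: 100 class pools of 20 students, 3 pools positive.
pool_sizes = [20] * 100
positive = [True] * 3 + [False] * 97

pooled = pooled_test_count(pool_sizes, positive)   # 100 pool tests + 60 depooling tests
individual = individual_test_count(pool_sizes)     # 2000 individual tests
ratio = individual / pooled                        # 12.5 in this made-up scenario
```

The ratio grows as pool positivity falls, which is one reason pooling is most economical when prevalence is low.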